{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'\\n    @Author: King\\n    @Date: 2019.06.25\\n    @Purpose: Graph Convolutional Network\\n    @Introduction:   \\n    @Datasets: \\n    @Link : \\n    @Reference : https://docs.dgl.ai/tutorials/models/1_gnn/1_gcn.html\\n'"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "'''\n",
    "    @Author: King\n",
    "    @Date: 2019.06.25\n",
    "    @Purpose: Graph Convolutional Network\n",
    "    @Introduction:   This is a gentle introduction of using DGL to implement \n",
    "                    Graph Convolutional Networks (Kipf & Welling et al., \n",
    "                    Semi-Supervised Classification with Graph Convolutional Networks). \n",
    "                    We build upon the earlier tutorial on DGLGraph and demonstrate how DGL \n",
    "                    combines graph with deep neural network and learn structural representations.\n",
    "    @Datasets: \n",
    "    @Link : \n",
    "    @Reference : https://docs.dgl.ai/tutorials/models/1_gnn/1_gcn.html\n",
    "'''"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Graph Convolutional Network\n",
    "\n",
    "## Model Overview\n",
    "\n",
    "### GCN from the perspective of message passing\n",
    "\n",
    "We describe a layer of graph convolutional neural network from a message passing perspective; the math can be found here(https://docs.dgl.ai/tutorials/models/1_gnn/1_gcn.html#math). It boils down to the following step, for each node u:\n",
    "\n",
    "\n",
    "1) Aggregate neighbors’ representations $h_{v}$ to produce an intermediate representation $\\hat{h}_{u}$.\n",
    "2) Transform the aggregated representation $\\hat{h}_{u}$ with a linear projection followed by a non-linearity: $h_{u}=f\\left(W_{u} \\hat{h}_{u}\\right)$.\n",
    "\n",
    "We will implement step 1 with DGL message passing, and step 2 with the apply_nodes method, whose node UDF will be a PyTorch nn.Module.\n",
    "\n",
    "\n",
    "### GCN implementation with DGL\n",
    "\n",
    "We first define the message and reduce function as usual. Since the aggregation on a node u only involves summing over the neighbors’ representations $h_{v}$, \n",
    "(我们像往常一样定义message and reduce function。由于节点uu上的聚合仅涉及对邻居的表示 $h_{v}$ 求和)\n",
    "\n",
    "we can simply use builtin functions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import dgl\n",
    "import dgl.function as fn\n",
    "import torch as th\n",
    "import torch.nn as nn\n",
    "import torch.nn.functional as F\n",
    "from dgl import DGLGraph\n",
    "\n",
    "gcn_msg = fn.copy_src(src='h', out='m')\n",
    "gcn_reduce = fn.sum(msg='m', out='h')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We then define the node UDF for apply_nodes, which is a fully-connected layer:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "class NodeApplyModule(nn.Module):\n",
    "    def __init__(self, in_feats, out_feats, activation):\n",
    "        super(NodeApplyModule, self).__init__()\n",
    "        self.linear = nn.Linear(in_feats, out_feats)\n",
    "        self.activation = activation\n",
    "\n",
    "    def forward(self, node):\n",
    "        h = self.linear(node.data['h'])\n",
    "        h = self.activation(h)\n",
    "        return {'h' : h}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We then proceed to define the GCN module. A GCN layer essentially performs message passing on all the nodes then applies the NodeApplyModule. Note that we omitted the dropout in the paper for simplicity.\n",
    "\n",
    "然后我们继续定义GCN模块。 GCN层实质上在所有节点上执行消息传递，然后应用NodeApplyModule。 请注意，为简单起见，我们省略了文中的丢失。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "class GCN(nn.Module):\n",
    "    def __init__(self, in_feats, out_feats, activation):\n",
    "        super(GCN, self).__init__()\n",
    "        self.apply_mod = NodeApplyModule(in_feats, out_feats, activation)\n",
    "\n",
    "    def forward(self, g, feature):\n",
    "        g.ndata['h'] = feature\n",
    "        g.update_all(gcn_msg, gcn_reduce)\n",
    "        g.apply_nodes(func=self.apply_mod)\n",
    "        return g.ndata.pop('h')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The forward function is essentially the same as any other commonly seen NNs model in PyTorch. \n",
    "\n",
    "We can initialize GCN like any nn.Module. For example, let’s define a simple neural network consisting of two GCN layers. Suppose we are training the classifier for the cora dataset (the input feature size is 1433 and the number of classes is 7).\n",
    "\n",
    "我们可以像任何nn.Module一样初始化GCN。例如，让我们定义一个由两个GCN层组成的简单神经网络。假设我们正在训练cora数据集的分类器（输入要素大小为1433，类的数量为7）。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Net(\n",
      "  (gcn1): GCN(\n",
      "    (apply_mod): NodeApplyModule(\n",
      "      (linear): Linear(in_features=1433, out_features=16, bias=True)\n",
      "    )\n",
      "  )\n",
      "  (gcn2): GCN(\n",
      "    (apply_mod): NodeApplyModule(\n",
      "      (linear): Linear(in_features=16, out_features=7, bias=True)\n",
      "    )\n",
      "  )\n",
      ")\n"
     ]
    }
   ],
   "source": [
    "class Net(nn.Module):\n",
    "    def __init__(self):\n",
    "        super(Net, self).__init__()\n",
    "        self.gcn1 = GCN(1433, 16, F.relu)\n",
    "        self.gcn2 = GCN(16, 7, F.relu)\n",
    "\n",
    "    def forward(self, g, features):\n",
    "        x = self.gcn1(g, features)\n",
    "        x = self.gcn2(g, x)\n",
    "        return x\n",
    "net = Net()\n",
    "print(net)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We load the cora dataset using DGL’s built-in data module."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "from dgl.data import citation_graph as citegrh\n",
    "def load_cora_data():\n",
    "    data = citegrh.load_cora()\n",
    "    features = th.FloatTensor(data.features)\n",
    "    labels = th.LongTensor(data.labels)\n",
    "    mask = th.ByteTensor(data.train_mask)\n",
    "    g = data.graph\n",
    "    # add self loop\n",
    "    g.remove_edges_from(g.selfloop_edges())\n",
    "    g = DGLGraph(g)\n",
    "    g.add_edges(g.nodes(), g.nodes())\n",
    "    return g, features, labels, mask"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We then train the network as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Downloading C:\\Users\\ASUS\\.dgl/cora.zip from https://s3.us-east-2.amazonaws.com/dgl.ai/dataset/cora_raw.zip...\n",
      "Extracting file to C:\\Users\\ASUS\\.dgl/cora\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "D:\\progrom\\python\\python\\python3\\lib\\site-packages\\dgl\\base.py:18: UserWarning: Initializer is not set. Use zero initializer instead. To suppress this warning, use `set_initializer` to explicitly specify which initializer to use.\n",
      "  warnings.warn(msg)\n",
      "D:\\progrom\\python\\python\\python3\\lib\\site-packages\\numpy\\core\\fromnumeric.py:3118: RuntimeWarning: Mean of empty slice.\n",
      "  out=out, **kwargs)\n",
      "D:\\progrom\\python\\python\\python3\\lib\\site-packages\\numpy\\core\\_methods.py:85: RuntimeWarning: invalid value encountered in double_scalars\n",
      "  ret = ret.dtype.type(ret / rcount)\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 00000 | Loss 1.9448 | Time(s) nan\n",
      "Epoch 00001 | Loss 1.9262 | Time(s) nan\n",
      "Epoch 00002 | Loss 1.9091 | Time(s) nan\n",
      "Epoch 00003 | Loss 1.8922 | Time(s) 0.3211\n",
      "Epoch 00004 | Loss 1.8756 | Time(s) 0.3181\n",
      "Epoch 00005 | Loss 1.8603 | Time(s) 0.3191\n",
      "Epoch 00006 | Loss 1.8452 | Time(s) 0.3221\n",
      "Epoch 00007 | Loss 1.8301 | Time(s) 0.3209\n",
      "Epoch 00008 | Loss 1.8152 | Time(s) 0.3211\n",
      "Epoch 00009 | Loss 1.8005 | Time(s) 0.3253\n",
      "Epoch 00010 | Loss 1.7861 | Time(s) 0.3236\n",
      "Epoch 00011 | Loss 1.7720 | Time(s) 0.3224\n",
      "Epoch 00012 | Loss 1.7585 | Time(s) 0.3214\n",
      "Epoch 00013 | Loss 1.7454 | Time(s) 0.3218\n",
      "Epoch 00014 | Loss 1.7324 | Time(s) 0.3231\n",
      "Epoch 00015 | Loss 1.7196 | Time(s) 0.3230\n",
      "Epoch 00016 | Loss 1.7070 | Time(s) 0.3219\n",
      "Epoch 00017 | Loss 1.6946 | Time(s) 0.3234\n",
      "Epoch 00018 | Loss 1.6826 | Time(s) 0.3228\n",
      "Epoch 00019 | Loss 1.6709 | Time(s) 0.3250\n",
      "Epoch 00020 | Loss 1.6596 | Time(s) 0.3260\n",
      "Epoch 00021 | Loss 1.6485 | Time(s) 0.3272\n",
      "Epoch 00022 | Loss 1.6377 | Time(s) 0.3301\n",
      "Epoch 00023 | Loss 1.6269 | Time(s) 0.3297\n",
      "Epoch 00024 | Loss 1.6161 | Time(s) 0.3290\n",
      "Epoch 00025 | Loss 1.6051 | Time(s) 0.3284\n",
      "Epoch 00026 | Loss 1.5939 | Time(s) 0.3286\n",
      "Epoch 00027 | Loss 1.5828 | Time(s) 0.3297\n",
      "Epoch 00028 | Loss 1.5716 | Time(s) 0.3295\n",
      "Epoch 00029 | Loss 1.5607 | Time(s) 0.3296\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "import numpy as np\n",
    "g, features, labels, mask = load_cora_data()\n",
    "optimizer = th.optim.Adam(net.parameters(), lr=1e-3)\n",
    "dur = []\n",
    "for epoch in range(30):\n",
    "    if epoch >=3:\n",
    "        t0 = time.time()\n",
    "\n",
    "    logits = net(g, features)\n",
    "    logp = F.log_softmax(logits, 1)\n",
    "    loss = F.nll_loss(logp[mask], labels[mask])\n",
    "\n",
    "    optimizer.zero_grad()\n",
    "    loss.backward()\n",
    "    optimizer.step()\n",
    "\n",
    "    if epoch >=3:\n",
    "        dur.append(time.time() - t0)\n",
    "\n",
    "    print(\"Epoch {:05d} | Loss {:.4f} | Time(s) {:.4f}\".format(\n",
    "            epoch, loss.item(), np.mean(dur)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### GCN in one formula\n",
    "\n",
    "Mathematically, the GCN model follows this formula:\n",
    "\n",
    "$$\n",
    "H^{(l+1)}=\\sigma\\left(\\tilde{D}^{-\\frac{1}{2}} \\tilde{A} \\tilde{D}^{-\\frac{1}{2}} H^{(l)} W^{(l)}\\right)\n",
    "$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
